Dataset statistics
| Number of variables | 17 |
|---|---|
| Number of observations | 46428 |
| Missing cells | 18400 |
| Missing cells (%) | 2.3% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 6.0 MiB |
| Average record size in memory | 136.0 B |
Variable types
| Numeric | 11 |
|---|---|
| Categorical | 5 |
| DateTime | 1 |
Warnings
| Warning | Type |
|---|---|
| name has a high cardinality: 45489 distinct values | High cardinality |
| host_name has a high cardinality: 11081 distinct values | High cardinality |
| neighbourhood has a high cardinality: 219 distinct values | High cardinality |
| df_index is highly correlated with id | High correlation |
| id is highly correlated with df_index | High correlation |
| last_review has 9182 (19.8%) missing values | Missing |
| reviews_per_month has 9182 (19.8%) missing values | Missing |
| minimum_nights is highly skewed (γ1 = 21.79076237) | Skewed |
| name is uniformly distributed | Uniform |
| df_index has unique values | Unique |
| id has unique values | Unique |
| number_of_reviews has 9182 (19.8%) zeros | Zeros |
| availability_365 has 17005 (36.6%) zeros | Zeros |
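The percentages in the missing-value and zero warnings can be cross-checked against the dataset totals. The following is a minimal sketch, not part of the original report: the dictionary layout and the `pct` helper are illustrative, and it assumes the only columns with missing values are the four listed (the counts for name and host_name come from their variable sections).

```python
# Counts copied from this report; the data layout is only illustrative.
n_rows = 46428   # Number of observations
n_cols = 17      # Number of variables

# Assumption: these four columns account for all missing cells.
missing = {"last_review": 9182, "reviews_per_month": 9182, "name": 15, "host_name": 21}
zeros = {"number_of_reviews": 9182, "availability_365": 17005}


def pct(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


missing_pct = {col: pct(count, n_rows) for col, count in missing.items()}
zeros_pct = {col: pct(count, n_rows) for col, count in zeros.items()}

# Dataset-level figure: missing cells over all cells (rows x columns).
total_missing = sum(missing.values())
total_missing_pct = pct(total_missing, n_rows * n_cols)

print(missing_pct["last_review"], zeros_pct["availability_365"])
print(total_missing, total_missing_pct)
```

Under that assumption the sum of per-column missing counts reproduces the "Missing cells" total reported in the dataset statistics.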
Reproduction
| Analysis started | 2023-04-27 07:28:23.935200 |
|---|---|
| Analysis finished | 2023-04-27 07:28:39.757893 |
| Duration | 15.82 seconds |
| Software version | pandas-profiling v2.11.0 |
df_index
Real number (ℝ≥0)
| Distinct | 46428 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24313.23591 |
|---|---|
| Minimum | 0 |
| Maximum | 48894 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2428.35 |
| Q1 | 12176.75 |
| median | 24264.5 |
| Q3 | 36383.25 |
| 95-th percentile | 46373.3 |
| Maximum | 48894 |
| Range | 48894 |
| Interquartile range (IQR) | 24206.5 |
Descriptive statistics
| Standard deviation | 14042.69692 |
|---|---|
| Coefficient of variation (CV) | 0.5775741643 |
| Kurtosis | -1.188349436 |
| Mean | 24313.23591 |
| Median Absolute Deviation (MAD) | 12103.5 |
| Skewness | 0.009876485015 |
| Sum | 1128814917 |
| Variance | 197197336.6 |
| Monotonicity | Strictly increasing |
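The derived figures in the df_index tables above are related by simple identities (mean = sum / n, variance = std², CV = std / mean, IQR = Q3 − Q1, range = max − min). As a hedged illustration, the sketch below re-derives them from the reported values; the variable names and the tolerance are choices made here, not something the report specifies.

```python
import math

# Figures copied from the df_index tables above.
n = 46428
total = 1128814917
mean = 24313.23591
std = 14042.69692
variance = 197197336.6
cv = 0.5775741643
q1, q3 = 12176.75, 36383.25
minimum, maximum = 0, 48894

# The report rounds to about 10 significant digits, so compare with a tolerance.
TOL = 1e-6

checks = {
    "mean == sum / n": math.isclose(total / n, mean, rel_tol=TOL),
    "variance == std ** 2": math.isclose(std ** 2, variance, rel_tol=TOL),
    "cv == std / mean": math.isclose(std / mean, cv, rel_tol=TOL),
    "iqr == q3 - q1": q3 - q1 == 24206.5,
    "range == max - min": maximum - minimum == 48894,
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

The same identities can be applied to the other numeric variables in the report by swapping in their figures.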
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 39558 | 1 | < 0.1% |
| 10896 | 1 | < 0.1% |
| 8849 | 1 | < 0.1% |
| 14994 | 1 | < 0.1% |
| 12947 | 1 | < 0.1% |
| 2708 | 1 | < 0.1% |
| 6806 | 1 | < 0.1% |
| 4759 | 1 | < 0.1% |
| 27288 | 1 | < 0.1% |
| Other values (46418) | 46418 |
| Value | Count | Frequency (%) |
| 0 | 1 | |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 |
| Value | Count | Frequency (%) |
| 48894 | 1 | |
| 48893 | 1 | |
| 48892 | 1 | |
| 48891 | 1 | |
| 48890 | 1 |
id
Real number (ℝ≥0)
| Distinct | 46428 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18918078.01 |
|---|---|
| Minimum | 2539 |
| Maximum | 36487245 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | 2539 |
|---|---|
| 5-th percentile | 1209898.9 |
| Q1 | 9445461.25 |
| median | 19544622 |
| Q3 | 28937773.5 |
| 95-th percentile | 35225452.3 |
| Maximum | 36487245 |
| Range | 36484706 |
| Interquartile range (IQR) | 19492312.25 |
Descriptive statistics
| Standard deviation | 10931202.2 |
|---|---|
| Coefficient of variation (CV) | 0.5778177992 |
| Kurtosis | -1.219101857 |
| Mean | 18918078.01 |
| Median Absolute Deviation (MAD) | 9799792.5 |
| Skewness | -0.08074382881 |
| Sum | 8.78328526 × 10¹¹ |
| Variance | 1.194911816 × 10¹⁴ |
| Monotonicity | Strictly increasing |
| Value | Count | Frequency (%) |
| 32493112 | 1 | < 0.1% |
| 19274358 | 1 | < 0.1% |
| 20015736 | 1 | < 0.1% |
| 4809337 | 1 | < 0.1% |
| 16929407 | 1 | < 0.1% |
| 5941888 | 1 | < 0.1% |
| 31337900 | 1 | < 0.1% |
| 15970947 | 1 | < 0.1% |
| 22929141 | 1 | < 0.1% |
| 28842166 | 1 | < 0.1% |
| Other values (46418) | 46418 |
| Value | Count | Frequency (%) |
| 2539 | 1 | |
| 2595 | 1 | |
| 3647 | 1 | |
| 3831 | 1 | |
| 5022 | 1 |
| Value | Count | Frequency (%) |
| 36487245 | 1 | |
| 36485609 | 1 | |
| 36485431 | 1 | |
| 36485057 | 1 | |
| 36484665 | 1 |
name
Categorical
| Distinct | 45489 |
|---|---|
| Distinct (%) | 98.0% |
| Missing | 15 |
| Missing (%) | < 0.1% |
| Memory size | 362.8 KiB |
| Hillside Hotel | 18 |
|---|---|
| Home away from home | 17 |
| New york Multi-unit building | 13 |
| Brooklyn Apartment | 12 |
| Loft Suite @ The Box House Hotel | 11 |
| Other values (45484) |
Length
| Max length | 179 |
|---|---|
| Median length | 36 |
| Mean length | 36.76734966 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1706483 |
|---|---|
| Distinct characters | 768 |
| Distinct categories | 20 |
| Distinct scripts | 11 |
| Distinct blocks | 17 |
Unique
| Unique | 44878 |
|---|---|
| Unique (%) | 96.7% |
Sample
| 1st row | Clean & quiet apt home by the park |
|---|---|
| 2nd row | Skylit Midtown Castle |
| 3rd row | THE VILLAGE OF HARLEM....NEW YORK ! |
| 4th row | Cozy Entire Floor of Brownstone |
| 5th row | Entire Apt: Spacious Studio/Loft by central park |
| Value | Count | Frequency (%) |
| Hillside Hotel | 18 | < 0.1% |
| Home away from home | 17 | < 0.1% |
| New york Multi-unit building | 13 | < 0.1% |
| Brooklyn Apartment | 12 | < 0.1% |
| Loft Suite @ The Box House Hotel | 11 | < 0.1% |
| Private Room | 11 | < 0.1% |
| Private room | 10 | < 0.1% |
| Artsy Private BR in Fort Greene Cumberland | 10 | < 0.1% |
| Private room in Brooklyn | 8 | < 0.1% |
| Private room in Williamsburg | 8 | < 0.1% |
| Other values (45479) | 46295 | |
| (Missing) | 15 | < 0.1% |
| Value | Count | Frequency (%) |
| in | 16203 | 5.7% |
| room | 9976 | 3.5% |
| | 7815 | 2.8% |
| bedroom | 7283 | 2.6% |
| private | 7022 | 2.5% |
| apartment | 6461 | 2.3% |
| cozy | 4946 | 1.8% |
| apt | 4410 | 1.6% |
| brooklyn | 3918 | 1.4% |
| studio | 3897 | 1.4% |
| Other values (11921) | 210490 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 237569 | 13.9% |
| e | 117726 | 6.9% |
| o | 116911 | 6.9% |
| t | 100052 | 5.9% |
| a | 98692 | 5.8% |
| r | 93300 | 5.5% |
| i | 90377 | 5.3% |
| n | 90086 | 5.3% |
| l | 48969 | 2.9% |
| m | 47334 | 2.8% |
| Other values (758) | 665467 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1146598 | |
| Uppercase Letter | 251883 | 14.8% |
| Space Separator | 237573 | 13.9% |
| Other Punctuation | 31772 | 1.9% |
| Decimal Number | 22880 | 1.3% |
| Dash Punctuation | 6440 | 0.4% |
| Other Letter | 2536 | 0.1% |
| Math Symbol | 2412 | 0.1% |
| Close Punctuation | 1470 | 0.1% |
| Open Punctuation | 1331 | 0.1% |
| Other values (10) | 1588 | 0.1% |
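The category names in the table above are Unicode general categories. As a rough illustration of how such counts can be obtained, the sketch below uses Python's standard `unicodedata` module; the `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced here, and this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Map some two-letter general-category codes to the long names used in the report.
# Codes not listed here are kept as-is.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
}


def category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in texts:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two of the sample rows shown for this variable.
sample = ["Skylit Midtown Castle", "Cozy Entire Floor of Brownstone"]
counts = category_counts(sample)
print(counts.most_common())
```

Script and block membership are not exposed by `unicodedata`, so those tables would need a different data source.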
Most frequent character per category
| Value | Count | Frequency (%) |
| 房 | 81 | 3.2% |
| 家 | 46 | 1.8% |
| 中 | 44 | 1.7% |
| 间 | 41 | 1.6% |
| 的 | 38 | 1.5% |
| 拉 | 36 | 1.4% |
| 法 | 35 | 1.4% |
| 盛 | 35 | 1.4% |
| 约 | 29 | 1.1% |
| 大 | 29 | 1.1% |
| Other values (518) | 2122 |
| Value | Count | Frequency (%) |
| e | 117726 | 10.3% |
| o | 116911 | 10.2% |
| t | 100052 | 8.7% |
| a | 98692 | 8.6% |
| r | 93300 | 8.1% |
| i | 90377 | 7.9% |
| n | 90086 | 7.9% |
| l | 48969 | 4.3% |
| m | 47334 | 4.1% |
| s | 45537 | 4.0% |
| Other values (57) | 297614 |
| Value | Count | Frequency (%) |
| ★ | 212 | |
| ❤ | 155 | |
| ☆ | 105 | |
| ♥ | 37 | 4.7% |
| ⭐ | 34 | 4.3% |
| ✨ | 30 | 3.8% |
| ❥ | 25 | 3.2% |
| ✿ | 15 | 1.9% |
| ☀ | 15 | 1.9% |
| ✰ | 14 | 1.8% |
| Other values (45) | 142 |
| Value | Count | Frequency (%) |
| B | 28028 | 11.1% |
| S | 24719 | 9.8% |
| C | 19963 | 7.9% |
| A | 18361 | 7.3% |
| R | 16833 | 6.7% |
| P | 13884 | 5.5% |
| E | 13057 | 5.2% |
| L | 12914 | 5.1% |
| M | 11116 | 4.4% |
| N | 10812 | 4.3% |
| Other values (33) | 82196 |
| Value | Count | Frequency (%) |
| , | 8742 | |
| ! | 7439 | |
| / | 4768 | |
| . | 4151 | |
| & | 3043 | 9.6% |
| ' | 1026 | 3.2% |
| * | 835 | 2.6% |
| : | 552 | 1.7% |
| # | 501 | 1.6% |
| " | 278 | 0.9% |
| Other values (11) | 437 | 1.4% |
| Value | Count | Frequency (%) |
| + | 1223 | |
| | | 858 | |
| ~ | 247 | 10.2% |
| = | 32 | 1.3% |
| > | 19 | 0.8% |
| < | 19 | 0.8% |
| → | 6 | 0.2% |
| ⋆ | 4 | 0.2% |
| √ | 2 | 0.1% |
| × | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 8327 | |
| 2 | 6019 | |
| 3 | 2159 | 9.4% |
| 5 | 1919 | 8.4% |
| 0 | 1870 | 8.2% |
| 4 | 1080 | 4.7% |
| 6 | 503 | 2.2% |
| 7 | 408 | 1.8% |
| 8 | 366 | 1.6% |
| 9 | 229 | 1.0% |
| Value | Count | Frequency (%) |
| ( | 1278 | |
| [ | 35 | 2.6% |
| { | 8 | 0.6% |
| 【 | 8 | 0.6% |
| 《 | 2 | 0.2% |
| Value | Count | Frequency (%) |
| ) | 1416 | |
| ] | 36 | 2.4% |
| } | 8 | 0.5% |
| 】 | 8 | 0.5% |
| 》 | 2 | 0.1% |
| Value | Count | Frequency (%) |
| - | 6374 | |
| — | 41 | 0.6% |
| – | 24 | 0.4% |
| ― | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| ^ | 9 | |
| ` | 4 | |
| ´ | 3 | 18.8% |
| Value | Count | Frequency (%) |
| ゙ | 21 | |
| ー | 11 | |
| ゚ | 5 | 13.5% |
| Value | Count | Frequency (%) |
| (space) | 237569 | |
| | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| ’ | 188 | |
| ” | 37 | 16.4% |
| Value | Count | Frequency (%) |
| ️ | 150 | |
| ︎ | 14 | 8.5% |
| Value | Count | Frequency (%) |
| _ | 42 | |
| ‿ | 1 | 2.3% |
| Value | Count | Frequency (%) |
| “ | 37 | |
| ‘ | 8 | 17.8% |
| Value | Count | Frequency (%) |
| $ | 86 |
| Value | Count | Frequency (%) |
| ² | 7 |
| Value | Count | Frequency (%) |
| 181 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1398278 | |
| Common | 305302 | 17.9% |
| Han | 2226 | 0.1% |
| Cyrillic | 191 | < 0.1% |
| Inherited | 164 | < 0.1% |
| Katakana | 136 | < 0.1% |
| Hiragana | 70 | < 0.1% |
| Hangul | 70 | < 0.1% |
| Hebrew | 31 | < 0.1% |
| Georgian | 13 | < 0.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 房 | 81 | 3.6% |
| 家 | 46 | 2.1% |
| 中 | 44 | 2.0% |
| 间 | 41 | 1.8% |
| 的 | 38 | 1.7% |
| 拉 | 36 | 1.6% |
| 法 | 35 | 1.6% |
| 盛 | 35 | 1.6% |
| 约 | 29 | 1.3% |
| 大 | 29 | 1.3% |
| Other values (399) | 1812 |
| Value | Count | Frequency (%) |
| (space) | 237569 | |
| , | 8742 | 2.9% |
| 1 | 8327 | 2.7% |
| ! | 7439 | 2.4% |
| - | 6374 | 2.1% |
| 2 | 6019 | 2.0% |
| / | 4768 | 1.6% |
| . | 4151 | 1.4% |
| & | 3043 | 1.0% |
| 3 | 2159 | 0.7% |
| Other values (118) | 16711 | 5.5% |
| Value | Count | Frequency (%) |
| e | 117726 | 8.4% |
| o | 116911 | 8.4% |
| t | 100052 | 7.2% |
| a | 98692 | 7.1% |
| r | 93300 | 6.7% |
| i | 90377 | 6.5% |
| n | 90086 | 6.4% |
| l | 48969 | 3.5% |
| m | 47334 | 3.4% |
| s | 45537 | 3.3% |
| Other values (67) | 549294 |
| Value | Count | Frequency (%) |
| 한 | 7 | 10.0% |
| 웃 | 3 | 4.3% |
| 성 | 3 | 4.3% |
| 스 | 2 | 2.9% |
| 리 | 2 | 2.9% |
| 고 | 2 | 2.9% |
| 맨 | 2 | 2.9% |
| 하 | 2 | 2.9% |
| 소 | 2 | 2.9% |
| 따 | 2 | 2.9% |
| Other values (38) | 43 |
| Value | Count | Frequency (%) |
| а | 26 | |
| о | 18 | 9.4% |
| т | 17 | 8.9% |
| н | 15 | 7.9% |
| е | 13 | 6.8% |
| к | 11 | 5.8% |
| р | 11 | 5.8% |
| м | 10 | 5.2% |
| с | 9 | 4.7% |
| в | 9 | 4.7% |
| Other values (23) | 52 |
| Value | Count | Frequency (%) |
| ン | 14 | 10.3% |
| ク | 12 | 8.8% |
| リ | 10 | 7.4% |
| ハ | 9 | 6.6% |
| ッ | 9 | 6.6% |
| ア | 9 | 6.6% |
| ス | 8 | 5.9% |
| ト | 7 | 5.1% |
| フ | 6 | 4.4% |
| ウ | 6 | 4.4% |
| Other values (22) | 46 |
| Value | Count | Frequency (%) |
| の | 16 | |
| で | 7 | |
| か | 7 | |
| ら | 6 | 8.6% |
| お | 5 | 7.1% |
| い | 4 | 5.7% |
| な | 4 | 5.7% |
| に | 3 | 4.3% |
| き | 2 | 2.9% |
| く | 2 | 2.9% |
| Other values (13) | 14 |
| Value | Count | Frequency (%) |
| י | 5 | |
| ו | 5 | |
| ב | 4 | |
| ר | 4 | |
| ע | 2 | 6.5% |
| ת | 2 | 6.5% |
| ה | 2 | 6.5% |
| ד | 1 | 3.2% |
| ש | 1 | 3.2% |
| ל | 1 | 3.2% |
| Other values (4) | 4 |
| Value | Count | Frequency (%) |
| ️ | 150 | |
| ︎ | 14 | 8.5% |
| Value | Count | Frequency (%) |
| ღ | 13 |
| Value | Count | Frequency (%) |
| ॐ | 2 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1702145 | |
| CJK | 2226 | 0.1% |
| Misc Symbols | 433 | < 0.1% |
| None | 423 | < 0.1% |
| Punctuation | 396 | < 0.1% |
| Dingbats | 297 | < 0.1% |
| Cyrillic | 191 | < 0.1% |
| VS | 164 | < 0.1% |
| Hiragana | 70 | < 0.1% |
| Hangul | 70 | < 0.1% |
| Other values (7) | 68 | < 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| (space) | 237569 | 14.0% |
| e | 117726 | 6.9% |
| o | 116911 | 6.9% |
| t | 100052 | 5.9% |
| a | 98692 | 5.8% |
| r | 93300 | 5.5% |
| i | 90377 | 5.3% |
| n | 90086 | 5.3% |
| l | 48969 | 2.9% |
| m | 47334 | 2.8% |
| Other values (86) | 661129 |
| Value | Count | Frequency (%) |
| ’ | 188 | |
| • | 59 | 14.9% |
| — | 41 | 10.4% |
| “ | 37 | 9.3% |
| ” | 37 | 9.3% |
| – | 24 | 6.1% |
| ‘ | 8 | 2.0% |
| ― | 1 | 0.3% |
| ‿ | 1 | 0.3% |
| Value | Count | Frequency (%) |
| ⭐ | 34 | 8.0% |
| à | 27 | 6.4% |
| ó | 24 | 5.7% |
| ゙ | 21 | 5.0% |
| é | 15 | 3.5% |
| 。 | 15 | 3.5% |
| ン | 14 | 3.3% |
| · | 13 | 3.1% |
| ク | 12 | 2.8% |
| ー | 11 | 2.6% |
| Other values (69) | 237 |
| Value | Count | Frequency (%) |
| ️ | 150 | |
| ︎ | 14 | 8.5% |
| Value | Count | Frequency (%) |
| ★ | 212 | |
| ☆ | 105 | |
| ♥ | 37 | 8.5% |
| ☀ | 15 | 3.5% |
| ♀ | 11 | 2.5% |
| ♡ | 6 | 1.4% |
| ♛ | 6 | 1.4% |
| ⚓ | 6 | 1.4% |
| ⚡ | 6 | 1.4% |
| ⚜ | 4 | 0.9% |
| Other values (10) | 25 | 5.8% |
| Value | Count | Frequency (%) |
| ❤ | 155 | |
| ✨ | 30 | 10.1% |
| ❥ | 25 | 8.4% |
| ✿ | 15 | 5.1% |
| ✰ | 14 | 4.7% |
| ✴ | 11 | 3.7% |
| ✪ | 8 | 2.7% |
| ➡ | 5 | 1.7% |
| ✌ | 5 | 1.7% |
| ✦ | 4 | 1.3% |
| Other values (11) | 25 | 8.4% |
| Value | Count | Frequency (%) |
| ™ | 1 |
| Value | Count | Frequency (%) |
| 房 | 81 | 3.6% |
| 家 | 46 | 2.1% |
| 中 | 44 | 2.0% |
| 间 | 41 | 1.8% |
| 的 | 38 | 1.7% |
| 拉 | 36 | 1.6% |
| 法 | 35 | 1.6% |
| 盛 | 35 | 1.6% |
| 约 | 29 | 1.3% |
| 大 | 29 | 1.3% |
| Other values (399) | 1812 |
| Value | Count | Frequency (%) |
| の | 16 | |
| で | 7 | |
| か | 7 | |
| ら | 6 | 8.6% |
| お | 5 | 7.1% |
| い | 4 | 5.7% |
| な | 4 | 5.7% |
| に | 3 | 4.3% |
| き | 2 | 2.9% |
| く | 2 | 2.9% |
| Other values (13) | 14 |
| Value | Count | Frequency (%) |
| а | 26 | |
| о | 18 | 9.4% |
| т | 17 | 8.9% |
| н | 15 | 7.9% |
| е | 13 | 6.8% |
| к | 11 | 5.8% |
| р | 11 | 5.8% |
| м | 10 | 5.2% |
| с | 9 | 4.7% |
| в | 9 | 4.7% |
| Other values (23) | 52 |
| Value | Count | Frequency (%) |
| י | 5 | |
| ו | 5 | |
| ב | 4 | |
| ר | 4 | |
| ע | 2 | 6.5% |
| ת | 2 | 6.5% |
| ה | 2 | 6.5% |
| ד | 1 | 3.2% |
| ש | 1 | 3.2% |
| ל | 1 | 3.2% |
| Other values (4) | 4 |
| Value | Count | Frequency (%) |
| ღ | 13 |
| Value | Count | Frequency (%) |
| 한 | 7 | 10.0% |
| 웃 | 3 | 4.3% |
| 성 | 3 | 4.3% |
| 스 | 2 | 2.9% |
| 리 | 2 | 2.9% |
| 고 | 2 | 2.9% |
| 맨 | 2 | 2.9% |
| 하 | 2 | 2.9% |
| 소 | 2 | 2.9% |
| 따 | 2 | 2.9% |
| Other values (38) | 43 |
| Value | Count | Frequency (%) |
| ▲ | 4 | |
| ◔ | 2 | |
| △ | 2 | |
| ◈ | 2 | |
| ▶ | 1 | 9.1% |
| Value | Count | Frequency (%) |
| ॐ | 2 |
| Value | Count | Frequency (%) |
| ⋆ | 4 | |
| √ | 2 | |
| ⊹ | 1 | 14.3% |
| Value | Count | Frequency (%) |
| ⏩ | 1 | |
| ⏪ | 1 | |
| ⌚ | 1 |
host_id
Real number (ℝ≥0)
| Distinct | 35770 |
|---|---|
| Distinct (%) | 77.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 66451005.18 |
|---|---|
| Minimum | 2438 |
| Maximum | 274321313 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | 2438 |
|---|---|
| 5-th percentile | 807362.4 |
| Q1 | 7719136.25 |
| median | 30321517.5 |
| Q3 | 105640471 |
| 95-th percentile | 238903945.7 |
| Maximum | 274321313 |
| Range | 274318875 |
| Interquartile range (IQR) | 97921334.75 |
Descriptive statistics
| Standard deviation | 77691272.84 |
|---|---|
| Coefficient of variation (CV) | 1.169151206 |
| Kurtosis | 0.2656304287 |
| Mean | 66451005.18 |
| Median Absolute Deviation (MAD) | 27027079.5 |
| Skewness | 1.235988162 |
| Sum | 3.085187268 × 10¹² |
| Variance | 6.035933876 × 10¹⁵ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 219517861 | 272 | 0.6% |
| 107434423 | 192 | 0.4% |
| 137358866 | 103 | 0.2% |
| 30283594 | 98 | 0.2% |
| 12243051 | 95 | 0.2% |
| 16098958 | 91 | 0.2% |
| 61391963 | 91 | 0.2% |
| 22541573 | 87 | 0.2% |
| 1475015 | 52 | 0.1% |
| 7503643 | 52 | 0.1% |
| Other values (35760) | 45295 |
| Value | Count | Frequency (%) |
| 2438 | 1 | < 0.1% |
| 2571 | 1 | < 0.1% |
| 2787 | 6 | |
| 2845 | 2 | < 0.1% |
| 2868 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 274321313 | 1 | |
| 274311461 | 1 | |
| 274307600 | 1 | |
| 274298453 | 1 | |
| 274273284 | 1 |
host_name
Categorical
| Distinct | 11081 |
|---|---|
| Distinct (%) | 23.9% |
| Missing | 21 |
| Missing (%) | < 0.1% |
| Memory size | 362.8 KiB |
| Michael | 395 |
|---|---|
| David | 375 |
| John | 279 |
| Sonder (NYC) | 272 |
| Alex | 260 |
| Other values (11076) |
Length
| Max length | 35 |
|---|---|
| Median length | 6 |
| Mean length | 6.109595535 |
| Min length | 1 |
Characters and Unicode
| Total characters | 283528 |
|---|---|
| Distinct characters | 199 |
| Distinct categories | 14 |
| Distinct scripts | 7 |
| Distinct blocks | 8 |
Unique
| Unique | 6646 |
|---|---|
| Unique (%) | 14.3% |
Sample
| 1st row | John |
|---|---|
| 2nd row | Jennifer |
| 3rd row | Elisabeth |
| 4th row | LisaRoxanne |
| 5th row | Laura |
| Value | Count | Frequency (%) |
| Michael | 395 | 0.9% |
| David | 375 | 0.8% |
| John | 279 | 0.6% |
| Sonder (NYC) | 272 | 0.6% |
| Alex | 260 | 0.6% |
| Sarah | 221 | 0.5% |
| Daniel | 217 | 0.5% |
| Maria | 199 | 0.4% |
| Blueground | 192 | 0.4% |
| Jessica | 188 | 0.4% |
| Other values (11071) | 43809 |
| Value | Count | Frequency (%) |
| | 1055 | 2.0% |
| and | 589 | 1.1% |
| michael | 438 | 0.8% |
| david | 416 | 0.8% |
| sonder | 367 | 0.7% |
| john | 319 | 0.6% |
| alex | 309 | 0.6% |
| laura | 284 | 0.5% |
| nyc | 282 | 0.5% |
| maria | 234 | 0.5% |
| Other values (9966) | 47391 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 36141 | 12.7% |
| e | 27173 | 9.6% |
| i | 23124 | 8.2% |
| n | 22810 | 8.0% |
| r | 16949 | 6.0% |
| l | 14519 | 5.1% |
| o | 12101 | 4.3% |
| t | 8937 | 3.2% |
| s | 8673 | 3.1% |
| h | 8593 | 3.0% |
| Other values (189) | 104508 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 223752 | |
| Uppercase Letter | 51840 | 18.3% |
| Space Separator | 5372 | 1.9% |
| Other Punctuation | 1509 | 0.5% |
| Open Punctuation | 324 | 0.1% |
| Close Punctuation | 322 | 0.1% |
| Dash Punctuation | 198 | 0.1% |
| Other Letter | 106 | < 0.1% |
| Decimal Number | 69 | < 0.1% |
| Math Symbol | 30 | < 0.1% |
| Other values (4) | 6 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 明 | 6 | 5.7% |
| 青 | 5 | 4.7% |
| 美 | 5 | 4.7% |
| 德 | 5 | 4.7% |
| 文 | 4 | 3.8% |
| 常 | 3 | 2.8% |
| 春 | 3 | 2.8% |
| 铀 | 3 | 2.8% |
| 正 | 2 | 1.9% |
| 川 | 2 | 1.9% |
| Other values (58) | 68 |
| Value | Count | Frequency (%) |
| a | 36141 | |
| e | 27173 | |
| i | 23124 | |
| n | 22810 | |
| r | 16949 | 7.6% |
| l | 14519 | 6.5% |
| o | 12101 | 5.4% |
| t | 8937 | 4.0% |
| s | 8673 | 3.9% |
| h | 8593 | 3.8% |
| Other values (54) | 44732 |
| Value | Count | Frequency (%) |
| A | 6068 | |
| J | 5146 | 9.9% |
| M | 5070 | 9.8% |
| S | 4477 | 8.6% |
| C | 3531 | 6.8% |
| L | 2760 | 5.3% |
| D | 2616 | 5.0% |
| K | 2511 | 4.8% |
| R | 2374 | 4.6% |
| E | 2270 | 4.4% |
| Other values (28) | 15017 |
| Value | Count | Frequency (%) |
| 5 | 16 | |
| 7 | 12 | |
| 0 | 11 | |
| 2 | 8 | |
| 4 | 7 | |
| 1 | 6 | 8.7% |
| 3 | 3 | 4.3% |
| 6 | 3 | 4.3% |
| 8 | 2 | 2.9% |
| 9 | 1 | 1.4% |
| Value | Count | Frequency (%) |
| & | 1097 | |
| . | 294 | 19.5% |
| / | 39 | 2.6% |
| , | 35 | 2.3% |
| ' | 24 | 1.6% |
| @ | 8 | 0.5% |
| " | 6 | 0.4% |
| ! | 4 | 0.3% |
| : | 2 | 0.1% |
| Value | Count | Frequency (%) |
| (space) | 5366 | |
| | 6 | 0.1% |
| Value | Count | Frequency (%) |
| + | 30 |
| Value | Count | Frequency (%) |
| ( | 324 |
| Value | Count | Frequency (%) |
| ) | 322 |
| Value | Count | Frequency (%) |
| - | 198 |
| Value | Count | Frequency (%) |
| | 2 |
| Value | Count | Frequency (%) |
| £ | 1 |
| Value | Count | Frequency (%) |
| _ | 1 |
| Value | Count | Frequency (%) |
| ’ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 275536 | |
| Common | 7830 | 2.8% |
| Han | 89 | < 0.1% |
| Cyrillic | 56 | < 0.1% |
| Hangul | 9 | < 0.1% |
| Hebrew | 5 | < 0.1% |
| Hiragana | 3 | < 0.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 36141 | 13.1% |
| e | 27173 | 9.9% |
| i | 23124 | 8.4% |
| n | 22810 | 8.3% |
| r | 16949 | 6.2% |
| l | 14519 | 5.3% |
| o | 12101 | 4.4% |
| t | 8937 | 3.2% |
| s | 8673 | 3.1% |
| h | 8593 | 3.1% |
| Other values (70) | 96516 |
| Value | Count | Frequency (%) |
| 明 | 6 | 6.7% |
| 青 | 5 | 5.6% |
| 美 | 5 | 5.6% |
| 德 | 5 | 5.6% |
| 文 | 4 | 4.5% |
| 常 | 3 | 3.4% |
| 春 | 3 | 3.4% |
| 铀 | 3 | 3.4% |
| 正 | 2 | 2.2% |
| 川 | 2 | 2.2% |
| Other values (43) | 51 |
| Value | Count | Frequency (%) |
| (space) | 5366 | |
| & | 1097 | 14.0% |
| ( | 324 | 4.1% |
| ) | 322 | 4.1% |
| . | 294 | 3.8% |
| - | 198 | 2.5% |
| / | 39 | 0.5% |
| , | 35 | 0.4% |
| + | 30 | 0.4% |
| ' | 24 | 0.3% |
| Other values (19) | 101 | 1.3% |
| Value | Count | Frequency (%) |
| е | 6 | |
| н | 6 | |
| а | 6 | |
| А | 4 | 7.1% |
| л | 4 | 7.1% |
| и | 4 | 7.1% |
| к | 3 | 5.4% |
| с | 3 | 5.4% |
| й | 3 | 5.4% |
| р | 3 | 5.4% |
| Other values (12) | 14 |
| Value | Count | Frequency (%) |
| 소 | 2 | |
| 정 | 2 | |
| 단 | 1 | |
| 비 | 1 | |
| 진 | 1 | |
| 빈 | 1 | |
| 나 | 1 |
| Value | Count | Frequency (%) |
| ד | 1 | |
| נ | 1 | |
| י | 1 | |
| א | 1 | |
| ל | 1 |
| Value | Count | Frequency (%) |
| ゆ | 1 | |
| り | 1 | |
| あ | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 283116 | |
| None | 240 | 0.1% |
| CJK | 89 | < 0.1% |
| Cyrillic | 56 | < 0.1% |
| Punctuation | 10 | < 0.1% |
| Hangul | 9 | < 0.1% |
| Hebrew | 5 | < 0.1% |
| Hiragana | 3 | < 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 36141 | 12.8% |
| e | 27173 | 9.6% |
| i | 23124 | 8.2% |
| n | 22810 | 8.1% |
| r | 16949 | 6.0% |
| l | 14519 | 5.1% |
| o | 12101 | 4.3% |
| t | 8937 | 3.2% |
| s | 8673 | 3.1% |
| h | 8593 | 3.0% |
| Other values (67) | 104096 |
| Value | Count | Frequency (%) |
| é | 104 | |
| í | 23 | 9.6% |
| á | 20 | 8.3% |
| ú | 19 | 7.9% |
| ë | 13 | 5.4% |
| ô | 11 | 4.6% |
| ó | 9 | 3.8% |
| è | 7 | 2.9% |
| ç | 5 | 2.1% |
| ï | 4 | 1.7% |
| Other values (19) | 25 | 10.4% |
| Value | Count | Frequency (%) |
| 明 | 6 | 6.7% |
| 青 | 5 | 5.6% |
| 美 | 5 | 5.6% |
| 德 | 5 | 5.6% |
| 文 | 4 | 4.5% |
| 常 | 3 | 3.4% |
| 春 | 3 | 3.4% |
| 铀 | 3 | 3.4% |
| 正 | 2 | 2.2% |
| 川 | 2 | 2.2% |
| Other values (43) | 51 |
| Value | Count | Frequency (%) |
| | 6 | |
| | 2 | 20.0% |
| ’ | 2 | 20.0% |
| Value | Count | Frequency (%) |
| 소 | 2 | |
| 정 | 2 | |
| 단 | 1 | |
| 비 | 1 | |
| 진 | 1 | |
| 빈 | 1 | |
| 나 | 1 |
| Value | Count | Frequency (%) |
| е | 6 | |
| н | 6 | |
| а | 6 | |
| А | 4 | 7.1% |
| л | 4 | 7.1% |
| и | 4 | 7.1% |
| к | 3 | 5.4% |
| с | 3 | 5.4% |
| й | 3 | 5.4% |
| р | 3 | 5.4% |
| Other values (12) | 14 |
| Value | Count | Frequency (%) |
| ד | 1 | |
| נ | 1 | |
| י | 1 | |
| א | 1 | |
| ל | 1 |
| Value | Count | Frequency (%) |
| ゆ | 1 | |
| り | 1 | |
| あ | 1 |
neighbourhood_group
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 362.8 KiB |
| Manhattan | |
|---|---|
| Brooklyn | |
| Queens | |
| Bronx | 1072 |
| Staten Island | 365 |
Length
| Max length | 13 |
|---|---|
| Median length | 8 |
| Mean length | 8.157060395 |
| Min length | 5 |
Characters and Unicode
| Total characters | 378716 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Brooklyn |
|---|---|
| 2nd row | Manhattan |
| 3rd row | Manhattan |
| 4th row | Brooklyn |
| 5th row | Manhattan |
| Value | Count | Frequency (%) |
| Manhattan | 19855 | |
| Brooklyn | 19550 | |
| Queens | 5586 | 12.0% |
| Bronx | 1072 | 2.3% |
| Staten Island | 365 | 0.8% |
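Because neighbourhood_group has only five values, its length and character statistics follow directly from the value counts above. The sketch below is an illustrative re-derivation using only the standard library; the variable names are chosen here and nothing in it comes from the profiling tool itself.

```python
from collections import Counter

# Value counts copied from the table above.
value_counts = {
    "Manhattan": 19855,
    "Brooklyn": 19550,
    "Queens": 5586,
    "Bronx": 1072,
    "Staten Island": 365,
}

n_rows = sum(value_counts.values())

# Total characters = sum over values of (string length x number of rows).
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows

# Per-character totals, weighting each character by how often its value occurs.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

print(n_rows, total_chars, round(mean_length, 6))
print(char_counts.most_common(4))
print(len(char_counts))  # distinct characters, including the space
```

The results line up with the "Length" and "Characters and Unicode" tables for this variable, and the single space in "Staten Island" accounts for the 365 Space Separator characters.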
| Value | Count | Frequency (%) |
| manhattan | 19855 | |
| brooklyn | 19550 | |
| queens | 5586 | 11.9% |
| bronx | 1072 | 2.3% |
| island | 365 | 0.8% |
| staten | 365 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 66648 | |
| a | 60295 | |
| t | 40440 | |
| o | 40172 | |
| B | 20622 | 5.4% |
| r | 20622 | 5.4% |
| l | 19915 | 5.3% |
| M | 19855 | 5.2% |
| h | 19855 | 5.2% |
| k | 19550 | 5.2% |
| Other values (10) | 50742 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 331558 | |
| Uppercase Letter | 46793 | 12.4% |
| Space Separator | 365 | 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| n | 66648 | |
| a | 60295 | |
| t | 40440 | |
| o | 40172 | |
| r | 20622 | 6.2% |
| l | 19915 | 6.0% |
| h | 19855 | 6.0% |
| k | 19550 | 5.9% |
| y | 19550 | 5.9% |
| e | 11537 | 3.5% |
| Other values (4) | 12974 | 3.9% |
| Value | Count | Frequency (%) |
| B | 20622 | |
| M | 19855 | |
| Q | 5586 | 11.9% |
| S | 365 | 0.8% |
| I | 365 | 0.8% |
| Value | Count | Frequency (%) |
| (space) | 365 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 378351 | |
| Common | 365 | 0.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| n | 66648 | |
| a | 60295 | |
| t | 40440 | |
| o | 40172 | |
| B | 20622 | 5.5% |
| r | 20622 | 5.5% |
| l | 19915 | 5.3% |
| M | 19855 | 5.2% |
| h | 19855 | 5.2% |
| k | 19550 | 5.2% |
| Other values (9) | 50377 |
| Value | Count | Frequency (%) |
| (space) | 365 | |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 378716 |
Most frequent character per block
| Value | Count | Frequency (%) |
| n | 66648 | |
| a | 60295 | |
| t | 40440 | |
| o | 40172 | |
| B | 20622 | 5.4% |
| r | 20622 | 5.4% |
| l | 19915 | 5.3% |
| M | 19855 | 5.2% |
| h | 19855 | 5.2% |
| k | 19550 | 5.2% |
| Other values (10) | 50742 |
neighbourhood
Categorical
| Distinct | 219 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 362.8 KiB |
| Williamsburg | |
|---|---|
| Bedford-Stuyvesant | |
| Harlem | 2599 |
| Bushwick | 2442 |
| Upper West Side | 1814 |
| Other values (214) |
Length
| Max length | 26 |
|---|---|
| Median length | 12 |
| Mean length | 11.92599294 |
| Min length | 4 |
Characters and Unicode
| Total characters | 553700 |
|---|---|
| Distinct characters | 54 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 4 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Kensington |
|---|---|
| 2nd row | Midtown |
| 3rd row | Harlem |
| 4th row | Clinton Hill |
| 5th row | East Harlem |
| Value | Count | Frequency (%) |
| Williamsburg | 3771 | 8.1% |
| Bedford-Stuyvesant | 3647 | 7.9% |
| Harlem | 2599 | 5.6% |
| Bushwick | 2442 | 5.3% |
| Upper West Side | 1814 | 3.9% |
| Hell's Kitchen | 1769 | 3.8% |
| East Village | 1737 | 3.7% |
| Upper East Side | 1692 | 3.6% |
| Crown Heights | 1528 | 3.3% |
| Midtown | 1211 | 2.6% |
| Other values (209) | 24218 |
| Value | Count | Frequency (%) |
| east | 6298 | 8.4% |
| side | 4376 | 5.8% |
| williamsburg | 3771 | 5.0% |
| harlem | 3693 | 4.9% |
| bedford-stuyvesant | 3647 | 4.9% |
| upper | 3506 | 4.7% |
| heights | 3504 | 4.7% |
| village | 2905 | 3.9% |
| west | 2502 | 3.3% |
| bushwick | 2442 | 3.3% |
| Other values (231) | 38317 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 50730 | 9.2% |
| i | 39813 | 7.2% |
| s | 37999 | 6.9% |
| t | 36585 | 6.6% |
| a | 35885 | 6.5% |
| l | 32481 | 5.9% |
| r | 32219 | 5.8% |
| (space) | 28533 | 5.2% |
| n | 24857 | 4.5% |
| o | 22938 | 4.1% |
| Other values (44) | 211660 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 439490 | |
| Uppercase Letter | 79604 | 14.4% |
| Space Separator | 28533 | 5.2% |
| Dash Punctuation | 4172 | 0.8% |
| Other Punctuation | 1901 | 0.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 50730 | |
| i | 39813 | 9.1% |
| s | 37999 | 8.6% |
| t | 36585 | 8.3% |
| a | 35885 | 8.2% |
| l | 32481 | 7.4% |
| r | 32219 | 7.3% |
| n | 24857 | 5.7% |
| o | 22938 | 5.2% |
| d | 18794 | 4.3% |
| Other values (15) | 107189 |
| Value | Count | Frequency (%) |
| H | 11337 | |
| S | 10953 | |
| B | 8190 | |
| W | 7758 | |
| E | 6784 | |
| C | 5040 | 6.3% |
| U | 3567 | 4.5% |
| G | 3554 | 4.5% |
| F | 3112 | 3.9% |
| V | 2947 | 3.7% |
| Other values (14) | 16362 |
| Value | Count | Frequency (%) |
| ' | 1778 | |
| . | 121 | 6.4% |
| , | 2 | 0.1% |
| Value | Count | Frequency (%) |
| (space) | 28533 | |
| Value | Count | Frequency (%) |
| - | 4172 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 519094 | |
| Common | 34606 | 6.2% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 50730 | 9.8% |
| i | 39813 | 7.7% |
| s | 37999 | 7.3% |
| t | 36585 | 7.0% |
| a | 35885 | 6.9% |
| l | 32481 | 6.3% |
| r | 32219 | 6.2% |
| n | 24857 | 4.8% |
| o | 22938 | 4.4% |
| d | 18794 | 3.6% |
| Other values (39) | 186793 |
| Value | Count | Frequency (%) |
| (space) | 28533 | |
| - | 4172 | 12.1% |
| ' | 1778 | 5.1% |
| . | 121 | 0.3% |
| , | 2 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 553700 |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 50730 | 9.2% |
| i | 39813 | 7.2% |
| s | 37999 | 6.9% |
| t | 36585 | 6.6% |
| a | 35885 | 6.5% |
| l | 32481 | 5.9% |
| r | 32219 | 5.8% |
| (space) | 28533 | 5.2% |
| n | 24857 | 4.5% |
| o | 22938 | 4.1% |
| Other values (44) | 211660 |
latitude
Real number (ℝ≥0)
| Distinct | 18791 |
|---|---|
| Distinct (%) | 40.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.72857167 |
|---|---|
| Minimum | 40.49979 |
| Maximum | 40.91306 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | 40.49979 |
|---|---|
| 5-th percentile | 40.64538 |
| Q1 | 40.68936 |
| median | 40.72201 |
| Q3 | 40.76333 |
| 95-th percentile | 40.826463 |
| Maximum | 40.91306 |
| Range | 0.41327 |
| Interquartile range (IQR) | 0.07397 |
Descriptive statistics
| Standard deviation | 0.05519047241 |
|---|---|
| Coefficient of variation (CV) | 0.001355079988 |
| Kurtosis | 0.09972499142 |
| Mean | 40.72857167 |
| Median Absolute Deviation (MAD) | 0.03641 |
| Skewness | 0.2583660389 |
| Sum | 1890946.126 |
| Variance | 0.003045988245 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 40.71813 | 18 | < 0.1% |
| 40.69414 | 13 | < 0.1% |
| 40.68444 | 13 | < 0.1% |
| 40.68634 | 13 | < 0.1% |
| 40.71353 | 12 | < 0.1% |
| 40.71171 | 12 | < 0.1% |
| 40.68537 | 12 | < 0.1% |
| 40.76189 | 11 | < 0.1% |
| 40.71923 | 11 | < 0.1% |
| 40.76125 | 11 | < 0.1% |
| Other values (18781) | 46302 |
| Value | Count | Frequency (%) |
| 40.49979 | 1 | |
| 40.50641 | 1 | |
| 40.50708 | 1 | |
| 40.50868 | 1 | |
| 40.50873 | 1 |
| Value | Count | Frequency (%) |
| 40.91306 | 1 | |
| 40.91234 | 1 | |
| 40.91169 | 1 | |
| 40.91167 | 1 | |
| 40.90804 | 1 |
longitude
Real number (ℝ)
| Distinct | 14563 |
|---|---|
| Distinct (%) | 31.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -73.950968 |
|---|---|
| Minimum | -74.24442 |
| Maximum | -73.71299 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | -74.24442 |
|---|---|
| 5-th percentile | -74.00326 |
| Q1 | -73.9821 |
| median | -73.95457 |
| Q3 | -73.9346275 |
| 95-th percentile | -73.86389 |
| Maximum | -73.71299 |
| Range | 0.53143 |
| Interquartile range (IQR) | 0.0474725 |
Descriptive statistics
| Standard deviation | 0.04638583239 |
|---|---|
| Coefficient of variation (CV) | -0.0006272511861 |
| Kurtosis | 4.939479054 |
| Mean | -73.950968 |
| Median Absolute Deviation (MAD) | 0.02488 |
| Skewness | 1.249716322 |
| Sum | -3433395.542 |
| Variance | 0.002151645447 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -73.95677 | 18 | < 0.1% |
| -73.95427 | 17 | < 0.1% |
| -73.9506 | 16 | < 0.1% |
| -73.94791 | 16 | < 0.1% |
| -73.95136 | 16 | < 0.1% |
| -73.95405 | 16 | < 0.1% |
| -73.95332 | 16 | < 0.1% |
| -73.95725 | 15 | < 0.1% |
| -73.98439 | 15 | < 0.1% |
| -73.94537 | 15 | < 0.1% |
| Other values (14553) | 46268 | 99.7% |
| Value | Count | Frequency (%) |
| -74.24442 | 1 | < 0.1% |
| -74.24285 | 1 | < 0.1% |
| -74.24084 | 1 | < 0.1% |
| -74.23986 | 1 | < 0.1% |
| -74.23914 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| -73.71299 | 1 | < 0.1% |
| -73.7169 | 1 | < 0.1% |
| -73.71795 | 1 | < 0.1% |
| -73.71829 | 1 | < 0.1% |
| -73.71928 | 1 | < 0.1% |
room_type
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 362.8 KiB |
| Entire home/apt | 23252 |
|---|---|
| Private room | 22036 |
| Shared room | 1140 |
Length
| Max length | 15 |
|---|---|
| Median length | 15 |
| Mean length | 13.47790127 |
| Min length | 11 |
Characters and Unicode
| Total characters | 625752 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Private room |
|---|---|
| 2nd row | Entire home/apt |
| 3rd row | Private room |
| 4th row | Entire home/apt |
| 5th row | Entire home/apt |
| Value | Count | Frequency (%) |
| Entire home/apt | 23252 | 50.1% |
| Private room | 22036 | 47.5% |
| Shared room | 1140 | 2.5% |
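The Frequency (%) column is each count divided by the 46428 observations. A minimal sketch of that calculation for room_type follows. The helper `frequency_label` is hypothetical, and the "< 0.1%" threshold is an assumption inferred from how small counts are displayed in this report, not a documented rule.

```python
def frequency_label(count: int, total: int) -> str:
    """Format count/total as a percentage with one decimal place.

    Assumption: non-zero shares that would round to 0.0% are shown as "< 0.1%".
    """
    pct = 100 * count / total
    if 0 < pct < 0.05:
        return "< 0.1%"
    return f"{pct:.1f}%"


total = 46428  # number of observations in the dataset
counts = {"Entire home/apt": 23252, "Private room": 22036, "Shared room": 1140}

labels = {name: frequency_label(count, total) for name, count in counts.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

The same helper reproduces the "< 0.1%" entries in the numeric tables, where a single value accounts for only a handful of rows.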
| Value | Count | Frequency (%) |
| home/apt | 23252 | 25.0% |
| entire | 23252 | 25.0% |
| room | 23176 | 25.0% |
| private | 22036 | 23.7% |
| shared | 1140 | 1.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 69680 | 11.1% |
| r | 69604 | 11.1% |
| o | 69604 | 11.1% |
| t | 68540 | 11.0% |
| a | 46428 | 7.4% |
| (space) | 46428 | 7.4% |
| m | 46428 | 7.4% |
| i | 45288 | 7.2% |
| h | 24392 | 3.9% |
| E | 23252 | 3.7% |
| Other values (7) | 116108 | 18.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 509644 | 81.4% |
| Uppercase Letter | 46428 | 7.4% |
| Space Separator | 46428 | 7.4% |
| Other Punctuation | 23252 | 3.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 69680 | 13.7% |
| r | 69604 | 13.7% |
| o | 69604 | 13.7% |
| t | 68540 | 13.4% |
| a | 46428 | 9.1% |
| m | 46428 | 9.1% |
| i | 45288 | 8.9% |
| h | 24392 | 4.8% |
| n | 23252 | 4.6% |
| p | 23252 | 4.6% |
| Other values (2) | 23176 | 4.5% |
| Value | Count | Frequency (%) |
| E | 23252 | 50.1% |
| P | 22036 | 47.5% |
| S | 1140 | 2.5% |
| Value | Count | Frequency (%) |
| (space) | 46428 | 100.0% |
| Value | Count | Frequency (%) |
| / | 23252 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 556072 | 88.9% |
| Common | 69680 | 11.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 69680 | 12.5% |
| r | 69604 | 12.5% |
| o | 69604 | 12.5% |
| t | 68540 | 12.3% |
| a | 46428 | 8.3% |
| m | 46428 | 8.3% |
| i | 45288 | 8.1% |
| h | 24392 | 4.4% |
| E | 23252 | 4.2% |
| n | 23252 | 4.2% |
| Other values (5) | 69604 | 12.5% |
| Value | Count | Frequency (%) |
| (space) | 46428 | 66.6% |
| / | 23252 | 33.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 625752 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 69680 | 11.1% |
| r | 69604 | 11.1% |
| o | 69604 | 11.1% |
| t | 68540 | 11.0% |
| a | 46428 | 7.4% |
| (space) | 46428 | 7.4% |
| m | 46428 | 7.4% |
| i | 45288 | 7.2% |
| h | 24392 | 3.9% |
| E | 23252 | 3.7% |
| Other values (7) | 116108 | 18.6% |
price
Real number (ℝ≥0)
| Distinct | 337 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 122.5380159 |
|---|---|
| Minimum | 10 |
| Maximum | 350 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 40 |
| Q1 | 65 |
| median | 100 |
| Q3 | 160 |
| 95-th percentile | 275 |
| Maximum | 350 |
| Range | 340 |
| Interquartile range (IQR) | 95 |
Descriptive statistics
| Standard deviation | 71.86258104 |
|---|---|
| Coefficient of variation (CV) | 0.5864513191 |
| Kurtosis | 0.5184784007 |
| Mean | 122.5380159 |
| Median Absolute Deviation (MAD) | 44 |
| Skewness | 1.030664416 |
| Sum | 5689195 |
| Variance | 5164.230553 |
| Monotonicity | Not monotonic |
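The Median Absolute Deviation row is the median of the absolute distances from the median, which makes it far less sensitive to extreme values than the standard deviation. A minimal sketch of the definition follows. The five sample prices are made up for illustration and are not taken from the dataset.

```python
from statistics import median


def mad(values):
    """Median of the absolute deviations from the median."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


# Hypothetical sample, chosen only to make the arithmetic easy to follow.
sample_prices = [50, 65, 100, 160, 350]

# Median is 100; absolute deviations are 50, 35, 0, 60, 250; their median is 50.
print(mad(sample_prices))
```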
| Value | Count | Frequency (%) |
| 100 | 2051 | 4.4% |
| 150 | 2047 | 4.4% |
| 50 | 1534 | 3.3% |
| 60 | 1458 | 3.1% |
| 200 | 1401 | 3.0% |
| 75 | 1370 | 3.0% |
| 80 | 1272 | 2.7% |
| 65 | 1190 | 2.6% |
| 70 | 1170 | 2.5% |
| 120 | 1130 | 2.4% |
| Other values (327) | 31805 | 68.5% |
| Value | Count | Frequency (%) |
| 10 | 17 | < 0.1% |
| 11 | 3 | < 0.1% |
| 12 | 4 | < 0.1% |
| 13 | 1 | < 0.1% |
| 15 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 350 | 381 | 0.8% |
| 349 | 45 | 0.1% |
| 348 | 3 | < 0.1% |
| 347 | 4 | < 0.1% |
| 346 | 2 | < 0.1% |
minimum_nights
Real number (ℝ≥0)
| Distinct | 107 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.943180839 |
|---|---|
| Minimum | 1 |
| Maximum | 1250 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 5 |
| 95-th percentile | 30 |
| Maximum | 1250 |
| Range | 1249 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 19.8775096 |
|---|---|
| Coefficient of variation (CV) | 2.86288231 |
| Kurtosis | 873.9314378 |
| Mean | 6.943180839 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 21.79076237 |
| Sum | 322358 |
| Variance | 395.1153878 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 12148 | 26.2% |
| 2 | 11199 | 24.1% |
| 3 | 7506 | 16.2% |
| 30 | 3534 | 7.6% |
| 4 | 3106 | 6.7% |
| 5 | 2854 | 6.1% |
| 7 | 1975 | 4.3% |
| 6 | 694 | 1.5% |
| 14 | 543 | 1.2% |
| 10 | 464 | 1.0% |
| Other values (97) | 2405 | 5.2% |
| Value | Count | Frequency (%) |
| 1 | 12148 | 26.2% |
| 2 | 11199 | 24.1% |
| 3 | 7506 | 16.2% |
| 4 | 3106 | 6.7% |
| 5 | 2854 | 6.1% |
| Value | Count | Frequency (%) |
| 1250 | 1 | < 0.1% |
| 999 | 3 | < 0.1% |
| 500 | 5 | < 0.1% |
| 480 | 1 | < 0.1% |
| 400 | 1 | < 0.1% |
number_of_reviews
Real number (ℝ≥0)
| Distinct | 393 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.82771173 |
|---|---|
| Minimum | 0 |
| Maximum | 629 |
| Zeros | 9182 |
| Zeros (%) | 19.8% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 5 |
| Q3 | 24 |
| 95-th percentile | 116 |
| Maximum | 629 |
| Range | 629 |
| Interquartile range (IQR) | 23 |
Descriptive statistics
| Standard deviation | 45.190521 |
|---|---|
| Coefficient of variation (CV) | 1.896553119 |
| Kurtosis | 18.94434683 |
| Mean | 23.82771173 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 3.640349972 |
| Sum | 1106273 |
| Variance | 2042.183188 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 9182 | 19.8% |
| 1 | 4976 | 10.7% |
| 2 | 3318 | 7.1% |
| 3 | 2390 | 5.1% |
| 4 | 1918 | 4.1% |
| 5 | 1526 | 3.3% |
| 6 | 1297 | 2.8% |
| 7 | 1130 | 2.4% |
| 8 | 1080 | 2.3% |
| 9 | 924 | 2.0% |
| Other values (383) | 18687 | 40.2% |
| Value | Count | Frequency (%) |
| 0 | 9182 | 19.8% |
| 1 | 4976 | 10.7% |
| 2 | 3318 | 7.1% |
| 3 | 2390 | 5.1% |
| 4 | 1918 | 4.1% |
| Value | Count | Frequency (%) |
| 629 | 1 | < 0.1% |
| 607 | 1 | < 0.1% |
| 597 | 1 | < 0.1% |
| 594 | 1 | < 0.1% |
| 576 | 1 | < 0.1% |
last_review
| Distinct | 1754 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 9182 |
| Missing (%) | 19.8% |
| Memory size | 362.8 KiB |
| Minimum | 2011-03-28 00:00:00 |
|---|---|
| Maximum | 2019-07-08 00:00:00 |
reviews_per_month
Real number (ℝ≥0)
| Distinct | 936 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 9182 |
| Missing (%) | 19.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.377473017 |
|---|---|
| Minimum | 0.01 |
| Maximum | 58.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 0.04 |
| Q1 | 0.19 |
| median | 0.715 |
| Q3 | 2.02 |
| 95-th percentile | 4.67 |
| Maximum | 58.5 |
| Range | 58.49 |
| Interquartile range (IQR) | 1.83 |
Descriptive statistics
| Standard deviation | 1.690493397 |
|---|---|
| Coefficient of variation (CV) | 1.227242476 |
| Kurtosis | 43.0992735 |
| Mean | 1.377473017 |
| Median Absolute Deviation (MAD) | 0.615 |
| Skewness | 3.154919504 |
| Sum | 51305.36 |
| Variance | 2.857767925 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.02 | 886 | 1.9% |
| 0.05 | 856 | 1.8% |
| 1 | 828 | 1.8% |
| 0.03 | 772 | 1.7% |
| 0.04 | 639 | 1.4% |
| 0.16 | 638 | 1.4% |
| 0.08 | 580 | 1.2% |
| 0.09 | 564 | 1.2% |
| 0.06 | 557 | 1.2% |
| 0.11 | 527 | 1.1% |
| Other values (926) | 30399 | 65.5% |
| (Missing) | 9182 | 19.8% |
| Value | Count | Frequency (%) |
| 0.01 | 40 | 0.1% |
| 0.02 | 886 | 1.9% |
| 0.03 | 772 | 1.7% |
| 0.04 | 639 | 1.4% |
| 0.05 | 856 | 1.8% |
| Value | Count | Frequency (%) |
| 58.5 | 1 | < 0.1% |
| 27.95 | 1 | < 0.1% |
| 20.94 | 1 | < 0.1% |
| 19.75 | 1 | < 0.1% |
| 17.82 | 1 | < 0.1% |
calculated_host_listings_count
Real number (ℝ≥0)
| Distinct | 47 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.672503662 |
|---|---|
| Minimum | 1 |
| Maximum | 327 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 13 |
| Maximum | 327 |
| Range | 326 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 31.0834363 |
|---|---|
| Coefficient of variation (CV) | 4.658436754 |
| Kurtosis | 75.60395988 |
| Mean | 6.672503662 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 8.350507449 |
| Sum | 309791 |
| Variance | 966.180012 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 30677 | 66.1% |
| 2 | 6436 | 13.9% |
| 3 | 2745 | 5.9% |
| 4 | 1354 | 2.9% |
| 5 | 808 | 1.7% |
| 6 | 529 | 1.1% |
| 8 | 396 | 0.9% |
| 7 | 390 | 0.8% |
| 327 | 272 | 0.6% |
| 9 | 225 | 0.5% |
| Other values (37) | 2596 | 5.6% |
| Value | Count | Frequency (%) |
| 1 | 30677 | 66.1% |
| 2 | 6436 | 13.9% |
| 3 | 2745 | 5.9% |
| 4 | 1354 | 2.9% |
| 5 | 808 | 1.7% |
| Value | Count | Frequency (%) |
| 327 | 272 | 0.6% |
| 232 | 192 | 0.4% |
| 121 | 98 | 0.2% |
| 103 | 103 | 0.2% |
| 96 | 186 | 0.4% |
availability_365
Real number (ℝ≥0)
| Distinct | 366 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 109.6768545 |
|---|---|
| Minimum | 0 |
| Maximum | 365 |
| Zeros | 17005 |
| Zeros (%) | 36.6% |
| Memory size | 362.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 40 |
| Q3 | 217 |
| 95-th percentile | 358 |
| Maximum | 365 |
| Range | 365 |
| Interquartile range (IQR) | 217 |
Descriptive statistics
| Standard deviation | 130.4139522 |
|---|---|
| Coefficient of variation (CV) | 1.189074512 |
| Kurtosis | -0.9227839215 |
| Mean | 109.6768545 |
| Median Absolute Deviation (MAD) | 40 |
| Skewness | 0.8064281257 |
| Sum | 5092077 |
| Variance | 17007.79893 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 17005 | 36.6% |
| 365 | 1122 | 2.4% |
| 364 | 430 | 0.9% |
| 1 | 397 | 0.9% |
| 89 | 334 | 0.7% |
| 5 | 333 | 0.7% |
| 3 | 296 | 0.6% |
| 179 | 273 | 0.6% |
| 90 | 270 | 0.6% |
| 2 | 254 | 0.5% |
| Other values (356) | 25714 | 55.4% |
| Value | Count | Frequency (%) |
| 0 | 17005 | 36.6% |
| 1 | 397 | 0.9% |
| 2 | 254 | 0.5% |
| 3 | 296 | 0.6% |
| 4 | 227 | 0.5% |
| Value | Count | Frequency (%) |
| 365 | 1122 | 2.4% |
| 364 | 430 | 0.9% |
| 363 | 215 | 0.5% |
| 362 | 150 | 0.3% |
| 361 | 101 | 0.2% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
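As an illustration of the definition above, here is a minimal pure-Python sketch that divides the covariance by the product of the standard deviations and then checks the invariance under a change of location and (positive) scale. The function name and the sample data are illustrative only. This is not the code the report itself runs.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)

# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(xs, [3 * y + 7 for y in ys])

print(round(r, 4), round(r_rescaled, 4))
```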
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
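A minimal sketch of that recipe: convert each variable to ranks, then apply the Pearson formula to the ranks. Ties receive the average of the positions they occupy, which is a common convention and is assumed here rather than taken from the report. All names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank-transformed variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but clearly non-linear

print(spearman_rho(xs, ys))  # perfect monotonic association
print(pearson_r(xs, ys))     # below 1 because the relation is not linear
```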
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
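The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows. Note that statistical libraries often default to a tie-adjusted variant (τ-b), so values can differ when the data contain ties. The function name and the sample data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]  # one swapped pair out of six

tau = kendall_tau_a(xs, ys)
print(tau)  # (5 - 1) / 6
```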
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2539 | Clean & quiet apt home by the park | 2787 | John | Brooklyn | Kensington | 40.64749 | -73.97237 | Private room | 149 | 1 | 9 | 2018-10-19 | 0.21 | 6 | 365 |
| 1 | 1 | 2595 | Skylit Midtown Castle | 2845 | Jennifer | Manhattan | Midtown | 40.75362 | -73.98377 | Entire home/apt | 225 | 1 | 45 | 2019-05-21 | 0.38 | 2 | 355 |
| 2 | 2 | 3647 | THE VILLAGE OF HARLEM....NEW YORK ! | 4632 | Elisabeth | Manhattan | Harlem | 40.80902 | -73.94190 | Private room | 150 | 3 | 0 | NaT | NaN | 1 | 365 |
| 3 | 3 | 3831 | Cozy Entire Floor of Brownstone | 4869 | LisaRoxanne | Brooklyn | Clinton Hill | 40.68514 | -73.95976 | Entire home/apt | 89 | 1 | 270 | 2019-07-05 | 4.64 | 1 | 194 |
| 4 | 4 | 5022 | Entire Apt: Spacious Studio/Loft by central park | 7192 | Laura | Manhattan | East Harlem | 40.79851 | -73.94399 | Entire home/apt | 80 | 10 | 9 | 2018-11-19 | 0.10 | 1 | 0 |
| 5 | 5 | 5099 | Large Cozy 1 BR Apartment In Midtown East | 7322 | Chris | Manhattan | Murray Hill | 40.74767 | -73.97500 | Entire home/apt | 200 | 3 | 74 | 2019-06-22 | 0.59 | 1 | 129 |
| 6 | 6 | 5121 | BlissArtsSpace! | 7356 | Garon | Brooklyn | Bedford-Stuyvesant | 40.68688 | -73.95596 | Private room | 60 | 45 | 49 | 2017-10-05 | 0.40 | 1 | 0 |
| 7 | 7 | 5178 | Large Furnished Room Near B'way | 8967 | Shunichi | Manhattan | Hell's Kitchen | 40.76489 | -73.98493 | Private room | 79 | 2 | 430 | 2019-06-24 | 3.47 | 1 | 220 |
| 8 | 8 | 5203 | Cozy Clean Guest Room - Family Apt | 7490 | MaryEllen | Manhattan | Upper West Side | 40.80178 | -73.96723 | Private room | 79 | 2 | 118 | 2017-07-21 | 0.99 | 1 | 0 |
| 9 | 9 | 5238 | Cute & Cozy Lower East Side 1 bdrm | 7549 | Ben | Manhattan | Chinatown | 40.71344 | -73.99037 | Entire home/apt | 150 | 1 | 160 | 2019-06-09 | 1.33 | 4 | 188 |
Last rows
| | df_index | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 46418 | 48885 | 36482809 | Stunning Bedroom NYC! Walking to Central Park!! | 131529729 | Kendall | Manhattan | East Harlem | 40.79633 | -73.93605 | Private room | 75 | 2 | 0 | NaT | NaN | 2 | 353 |
| 46419 | 48886 | 36483010 | Comfy 1 Bedroom in Midtown East | 274311461 | Scott | Manhattan | Midtown | 40.75561 | -73.96723 | Entire home/apt | 200 | 6 | 0 | NaT | NaN | 1 | 176 |
| 46420 | 48887 | 36483152 | Garden Jewel Apartment in Williamsburg New York | 208514239 | Melki | Brooklyn | Williamsburg | 40.71232 | -73.94220 | Entire home/apt | 170 | 1 | 0 | NaT | NaN | 3 | 365 |
| 46421 | 48888 | 36484087 | Spacious Room w/ Private Rooftop, Central location | 274321313 | Kat | Manhattan | Hell's Kitchen | 40.76392 | -73.99183 | Private room | 125 | 4 | 0 | NaT | NaN | 1 | 31 |
| 46422 | 48889 | 36484363 | QUIT PRIVATE HOUSE | 107716952 | Michael | Queens | Jamaica | 40.69137 | -73.80844 | Private room | 65 | 1 | 0 | NaT | NaN | 2 | 163 |
| 46423 | 48890 | 36484665 | Charming one bedroom - newly renovated rowhouse | 8232441 | Sabrina | Brooklyn | Bedford-Stuyvesant | 40.67853 | -73.94995 | Private room | 70 | 2 | 0 | NaT | NaN | 2 | 9 |
| 46424 | 48891 | 36485057 | Affordable room in Bushwick/East Williamsburg | 6570630 | Marisol | Brooklyn | Bushwick | 40.70184 | -73.93317 | Private room | 40 | 4 | 0 | NaT | NaN | 2 | 36 |
| 46425 | 48892 | 36485431 | Sunny Studio at Historical Neighborhood | 23492952 | Ilgar & Aysel | Manhattan | Harlem | 40.81475 | -73.94867 | Entire home/apt | 115 | 10 | 0 | NaT | NaN | 1 | 27 |
| 46426 | 48893 | 36485609 | 43rd St. Time Square-cozy single bed | 30985759 | Taz | Manhattan | Hell's Kitchen | 40.75751 | -73.99112 | Shared room | 55 | 1 | 0 | NaT | NaN | 6 | 2 |
| 46427 | 48894 | 36487245 | Trendy duplex in the very heart of Hell's Kitchen | 68119814 | Christophe | Manhattan | Hell's Kitchen | 40.76404 | -73.98933 | Private room | 90 | 7 | 0 | NaT | NaN | 1 | 23 |